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ABSTRACT 

2 

I present here a generalization of the maximum Ukehhood method and the X 
method to the cases in which the data are not assumed to be Gaussian distributed. 
The method, based on the multivariate Edgeworth expansion, can find several as- 
trophysical applications, f mention only two of them. First, in the microwave 
background analysis, where it cannot be excluded that the initial perturbations are 
non-Gaussian. Second, in the large scale structure statistics, as we already know 
that the galaxy distribution deviates from Gaussianity on the scales at which non- 
linearity is important. As a first interesting result I show here how the confidence 
regions are modified when non-Gaussianity is taken into account. 



1. Introduction 

The role of statistics in large scale astrophysics is increasing at a very fast rate, 
barely keeping the pace with the flow of observational data from galaxy surveys and 
microwave background. Since we want not only to describe our Universe but also 
to understand it, we need quantitative ways to compare observations with theoret- 
ical models. This requires the choice of good statistical descriptors, like correlation 
functions, higher-order moments and similar, and the ability to determine their con- 
fidence regions (CR), i.e. the probability density function of the estimators. The 
general problem is that, while we certainly need some basic assumptions with respect 
to the statistical nature of the data, we want to keep these assumptions to a mini- 
mum. For instance, we would like to analyze the signal from the cosmic microwave 
background (CMB) experiments or from galaxy surveys without assuming that the 
data are Gaussian distributed. 

To this scope I present here a general method to approach such problems, based 
on the Multivariate Edgeworth Expansion (MEE), an Hermite expansion around a 
Gaussian distributiontl'B. This method is suitable to the cases in which the data can be 
reasonably assumed to be mildly non-Gaussian, and we wish to estimate the region of 
confidence of the relevant parameters without the Gaussian assumption. I can think 
of several applications of the MEE to the analysis of astrophysical data. In the case 
of the CMB, we can use the MEE to estimate at the same time such fundamental 
parameters like the primordial slope n and the quadrupole amplitude Qrms^ 
higher-order parameters like the skewness, along with their confidence regions. I 
will show that the CR may broaden or contract with respect to the Gaussian case. 
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Similarly, one can analyse in the same manner the large scale structure of galaxy 
clustering. Another application is to the case of fitting when the data are not 
Gaussian. More details on the method and on its applications can be found in another 
workH. 

2. Formalism 

Let dC" be a set of experimental data (e.g., CMB fluctuations, or galaxy counts), 
i = 1, ..iV, and let us form the variables = (i* — t*, where f are the theoretically 
expected values for the measured quantities. Let be the correlation matrix 

=< x^x^ > , (1) 
and let us introduce the higher-order cumulant matrices (or correlation functions) 

k"^^ = < x'x^x^ > (skewness matrix), (2) 
k'^f'^ = < x'x^x'^x^ > -d'c^^ - c*V' - c^V'^ (kurtosis matrix) . (3) 

The correlation matrices depend in general both on a number of theoretical param- 
eters Oj, j = 1,..P and on the experimental errors. We assume the latter to be 
Gaussian distributed and completely characterized by the correlation matrix to 
be added in quadrature to the 2-point correlation function. It is useful to define 
then the matrix Xij = {d^ + d^)~^ . The parameters aj are fixed by maximizing, with 
respect to the parameters, the likelihood function L = /(x) , where /(x) is the mul- 
tivariate probability distribution function (PDF) for the random variables Xj. The 
usual simplifying assumption is then that /(x) is a multivariate Gaussian distribution 

Lg = /(x) = ^(x. A) = (27r)-^/2|^|i/2gxp(-x^A,,xV2) . (4) 

where |A| = det(Ajj). A straightforward way to generalize the LF so as to include 
the higher-order correlation functions, which embody the non-Gaussian properties of 
the data, is provided by the MEE. An unknown PDF /(x) can inde£d be expanded 
around a multivariate Gaussian G{x, A) according to the formula a'Bcl 

/(x) = Gi^,X)[l + h'^%,ki^,X) + ^k'^'%,kii^,X) + ^^^^ , (5) 

where /ijj,. are Hermite tensors, a generalization of the Hermite polynomials. If there 
are r subscripts, the Hermite tensor hij is said to be of order r, and is given by 

hi,... = {-irG'\^, A)%..G(x, A) , (6) 

where dij... = {d / dxi){d j dxj).... It can be shown that the MEE gives a good ap- 
proximation to any distribution function provided that the cumulants obey the same 
order-of-magnitude scaling of a standardized mean.a This condition is satisfied, for 



instance, by the cumulants of the galaxy clustering in the scaling regime, which ex- 
plains why the (univariate) Edgeworth exnansion well approximates the probability 
distribution of the large scale density fieldQ' Q In the past years, the MEE has been 
employed also to approximate the biased density distribution for large value of the 
biasing threshold, to the scope of calculating the peak correlation functions for non-G 
random fieldsQ and other descriptors of excursion setsB. The same expansion has been 
also applied to the statistics of pencil-beam surveys^, and to go beyond the Gaussian 
approximation in calculating the topological genus of weakly non-Gaussian fieldgi2l. 
Let us also note that the MEE can also be immediately generalized to the case of 
experimental errors not Gaussian distributed. 



3. Best estimators 



The best parameter estimates are obtained by maximizing Eq. (|^) with respect 
to the parameters. To illustrate some interesting points, let us put ourselves in the 
simplest case, in which all data are independent, and we only need to estimate the 
parameters a and ^3 defined cLSi Cij — (J Oijj 

kijk = k^Sijdjk . The maximum likelihood 
estimators for the variance and the skewness are then obtained by putting 

dhgL ^ ^ d\ogL ^ ^ 
da ' dks 

The solution reduce then to the usual sample quantities 

^'=E^'/^. k=j:^'/N. (8) 

i i 

(which are asymptotically unbiassed) . The same calculation can be carried out in the 
more general case of dipendent variables, but the search for the maximum is more 
simply performed numerically when the situation is more complicated. 

Once we have the best estimators dj(x) of our parameters, we need to estimate 
the confidence regions for that paramaters. The problem consists in determining the 
behavior of the unknown distribution P[dj(x)], when we know the distribution for 
the random variables Xj. This problem is generally unsoluble analitycally, and the 
common approach is to resort to MonteCarlo simulations of the data. However, we 
can always approximate P{ai) around its peakhy a Gaussian distribution multivariate 
in the parameter space; if the number of data — * 00, this procedure can be justified 
by the central limit theorem. The covariance matrix of the parameters is theni 



i_ 91ogL(x,«„) 



(9) 



where a, b run over the dimensionality P of the parameter space. The component 
S22, i.e. the variance of k^, is then simply (dropping the hats here and below) 



S22 = 6aVAr, 



(10) 



which, not unexpectedly, is the sample skewness variance, i.e. the scatter in the 
skewness of Gaussian samples. More interesting is the error in the variance parameter 
a when not only a non-zero skewness is present, but also a non-zero kurtosis 
parameter /C4, defined in a way similar to as kijki = k^Sijki- The result is 

Sn = (aV2iV)[l + 72/2] , (11) 

where 72 = k^/a^ is the dimensionless kurtosis. Notice that, in the mild non- 
Gaussianity condition we are assuming throughout this work, the mixed components 
T,i2 = ^21 cire negligible. The first term in (0) is the usual variance of the sample 
variance for Gaussian, independent data. The second term is due to the kurtosis 
correction: it will broaden the CR for a when ^4 is positive, and will shrink it when 
it is negative. Depending on the relative amplitude of the higher-order corrections, 
the CR for the variance can extend or reduce. It is important however to remark that 
this estimate of the confidence regions is approximated, and that it can be trusted 
only around the peak of the likelihood function. 



4. Non-Gaussian method 



If our data are distributed following the MEE, then we can measure the likelihood 
to have found our actual dataset integrating the LP over all the possible outcomes of 
our experiment. Then the relevant integral we have to deal with is 



M(xo) = / L{x,X)l[dxi 



(12) 



where the region of integration extends over all the possible data values which lie inside 
the region delimited by the actual value Xo- We can then use M{xo) for evaluating a 
CR for the parameters which enter Xq, like the quadrupole and the primordial slope in 
the case of CMB. The CR will depend parametrically on the higher-order moments; 
however, this will not provide a CR for the higher-order moments themselves. The 
method of the previous section can always be employed to yield a first approximation 
for such moments. Fixing a confidence level of 1 — e, we will consider as acceptables 
the values of the parameters for which M{xo) is larger than e/2 and smaller than 
l — e/2. The evaluation of (|l^) would require some discussion!. Here, however, I only 
state the final result: 



M(xo) = / L\{dxi = FN{xo 



2T{2 + N/2) [^4^ + 2-Xoj+a(^- 



■N -2 + 2x1- ^° 



N + 4 



(13) 



where -F/v(xo) is the usual cumulative function, and = Ci + 3c2, and Cf, = 
C3 + 3c4 + 15c5, and the coefficients Cj are formedd by summing over all the even 
diagonals of the correlation tensors k^^" and multiplying for the Edgeworth coefficients 



(1/24) for ci, C2 and (1/72) for C3, C4 and C5. Let us make some comments on Eq. (p^. 
First, the fact that M{xo) is a cumulative function provides a simple way to check 
the consistency of our assumptions: when the higher-order moments are too large, 
the MEE breaks down, M{xo) is no longer monotonic, and can decrease below zero 
or above unity. Second, let us suppose that the higher-order correlation functions are 
positive, which is the case for the galaxy clustering. Then the non-G corrections in 
Eq. (|l^) are negative for Xo ^ The fact that the corrections are negative for 
Xo S> implies that the value of Xo = Xo(^) is larger than in the purely Gaussian 
case, in the limit of e 0. Consequently, if the higher-order correlation functions 
are positive, the confidence regions are sistematically widened when the non-Gaussian 
corrections are taken into account. Finally, it is easy to write down the result in the 
particular case in which all the cumulant matrices are diagonal, i.e. for statistically 
independent variables. In this case the variables are simply equal to x^/ai, if 
(Ti = (A-)"^/^, and we can put k^^^{y) = k^^^{x)/a^ = 7i^j, and likewise k^^^^{y) = 72,4 



(skewness and kurtosis coefficients). Then, we have Ci = C3 = C4 = 0, and Eq. ([131) 
can be simplified to 

M(xo) = i^iv(xo) + GN{xo)q{Xo) , (14) 



where 



{N + 2)T{N/2) [24 
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72 



-(iV + 2) + 2x^ 



Xo 



A^ + 4 



(15) 

and where the average squared skewness, •jf = J^lii/^, and the average kurtosis, 
72 = J2l2,i/N, have been introduced. 

Let me now illustrate graphically some properties of the function M{xo) in its 
simplified version ( p!^ above. In all this section we can think of Xo depending 
monotonically on one single parameter, for instance the overall normalization A > 
of the correlation function: Xo(^) ~ x'^x^{Acij + eij)~^. We can then speak of a CR 
on Xo meaning in fact the corresponding CR on the parameter A. In the general 
case, the relation between xo ^ind its parameters can be quite more complicated. 
In Fig. la (for 71 = and = 10), I show how the function M{xo) varies with 
respect to 72. Schematically, for Xo/^ > 1; the function M{xo) decreases when 
72 > and increases in the opposite case. As anticipated, for too large a 72, M{xo) 
develops a non-monotonic behavior. The consequence of the behavior of M{xo) on the 
confidence region of xo is represented in Fig. lb, where the contour plots of the surface 
M{xo:'l2) are shown. Consider for instance the two outer contours, corresponding 
to M = .01, the leftmost, and M = .99, the rightmost. The range of Xo inside such 
confidence levels increases for increasing 72. The same is true for the other contour 
levels, although with a less remarkable trend. This behavior confirms the approximate 
result of Eq. ([TT|). As anticipated, this means that the non-G confidence regions will 
be larger and larger (if the higher moments are positive) than the corresponding 
Gaussian regions for higher and higher probability thresholds. 

The situation is qualitatively different considering 72 = and varying 71, the 
average skewness (Fig. 2, with A^ = 100). For the outer contours, delimiting levels 



Figure 1: a) Plot of M(xo) as a function of y^/N and of the dimensionless kurtosis 
72, for 7i — 0,N — 10. For 72 = we return to the usual cumulative function, b) 
Contour levels of M{xo) corresponding to M = .01, .1, .2, .3, .7, .8, .9, .01, from left to 
right. Notice how the limits for xo broaden for increasing 72. 



Figure 2: a) Same as in Fig. la, now with 72 = 0, N — 100, and varying 71. b) 
Contour levels of M{xo) for the same values as in Fig. lb. 



of 1% on both tails, the CR of xo increases for larger |7i|, with a minimum for 
the Gaussian case. For the internal contours, however, the CR actually shrinks for 
larger |7i|, being maximal at the Gaussian point. It is clear that in the general case, 
7i,72 7^ 0, the topography of the LF can be quite complicated. 

5. Conclusions 

Let us summarize the results reported here. This work is aimed at presenting 
a new analytic formalism for parametric estimation with the maximum likelihood 
method for non-Gaussian random fields. The method can be applied to a large 
class of astrophysical problems. The non-Gaussian likelihood function allows the 
determination of a full set of parameters and their joint confidence region, without 
arbitrarily fixing some of them, as long as enough non-linear terms are included in the 
expansion. The CR for all the relevant parameters can be estimated by approximating 
the distribution function for the parameter estimators around its peak by a Gaussian, 
as in Sect. 3. To overcome this level of approximation, in Sect. 4 I generalized 
the method to include non-Gaussian corrections. The most interesting result is 
then that the CR for the parameters which enter Xo systematically widened by 
the inclusion of the non-Gaussian terms, in the limit of e -—>■ 0. Two experiments 
producing incompatible results can then be brought to agreement when third and 
fourth-order cumulants are introduced. 

There are two main limitations to the method. One is that one obviously has 
to truncate the MEE to some order, and consequently the data analysis implicitly 
assumes that all the higher moments vanish. The second limitation is that the method 
is not applicable to strongly non-Gaussian field, where the MEE breaks down. This 
can be seen directly from Eq. (pTSl): for arbitrarily large constants Ci — c^ the likelihood 
integral is not positive-definite, although always converge to unity. 
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